The impossibility of deriving quantum mechanics from classical mechanics is proved by means of classical
statistics. The formalism of the statistical Fisher information is used. Next, the Fisher information is
proposed as a tool for constructing a self-consistent field theory that joins quantum theory and classical
field theory.
1 Introduction It is said that classical mechanics is the stochastic limit of quantum mechanics.^1
Conversely, according to von Neumann [1], quantum theory is incompatible with the existence of
dispersion-free ensembles; hence it is well recognized that the predicted departure of a system from
classical behavior appears at the statistical level only [2]. Quite recently the geometry of the space of
distributions has been formulated under the name of statistical geometry, and the key to its description
is the Fisher information (FI) matrix, by which the distance between distributions can be defined. This
matrix (the Fisher-Rao metric, which is a Riemannian one) is used in the definition of the relative
entropy of two infinitesimally different distributions,^2 and it is also the Hessian matrix of the Shannon
entropy. Finally, it is related to the notion of the statistical FI which (in contrast to the global Shannon
entropy) characterizes local properties of the probability distribution [6]. In the statistically oriented
theory of measurement and estimation with an $N$-dimensional sample, the FI characterizes the local
properties of the likelihood function $p(y; \theta_1, \ldots, \theta_N)$, which formally is the joint
probability (density) of the data $y = (y_1, \ldots, y_N)$ but is treated as a function of the parameters
$\theta_n$. The set $\Theta = (\theta_1, \ldots, \theta_N)$ of the parameters^3 forms the coordinates in the
distribution space in which the distance between distributions is defined.

In the course of the paper it will be shown that the statistical FI is the right tool to address two mutually
related problems. The first one (Section 2) is the statistical proof of the impossibility of deriving
quantum mechanics from classical mechanics, and the second one (Sections 2 and 3) concerns the
consistency problem of the self-field formalism, which is used in such different branches of physical
research as superconductivity [7], atomic or particle physics, and astrophysics [8].



2 The proof Below, in order to prove that quantum mechanics cannot be derived from classical mechanics,
the methods of quantum mechanics and classical statistics are compared.



1 Yet it would be better to say that classical mechanics has the symplectic (manifold) structure and not the statistical one.
2 That is the infinitesimal version of the Kullback-Leibler relative entropy [9], which is a tool for comparing
models, especially in time series analysis.

3 In the most general case of the estimation procedure the dimensions of $y$ and $\Theta$ are usually different.
The maximum likelihood (ML) method is connected with the analysis of the first derivative of the
ln-likelihood function with respect to the parameters. Its second derivative, taken with the minus sign,
leads to the (observed) FI [9]. So the relation between the ML principle and the minimal statistical
information (and therefore maximal entropy (ME)) principle is not at all obvious. Let $y$ be the vector of
position and $\hat{\Theta} = (\hat{\theta}_1, \ldots, \hat{\theta}_N)$ the set of ML estimates of the vector
parameter $\Theta$. In the case of two infinitesimally close distributions $p(y)$ and $p(y + \Delta y)$, the
relation between the (expected) FI,^4 defined as

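$$I = \sum_{n=1}^{N} \int d^N y \, p(y|\Theta) \left( \frac{\partial \ln p(y|\Theta)}{\partial \theta_n} \right)^2 = \sum_{n=1}^{N} \int d^N y \, \frac{1}{p(y|\Theta)} \left( \frac{\partial p(y|\Theta)}{\partial \theta_n} \right)^2 \tag{1}$$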

and the Kullback-Leibler entropy $G$ [9] for two infinitesimally different distributions, is as follows [6]:

$$I \sim -G[p(y), p(y + \Delta y)] . \tag{2}$$

Hence it is the notion of the (relative) entropy which is the basic one and many of the properties of the 
entropy might be rewritten into the language of information. When entropy is the measure of disorder, the 
information is the measure of order. 
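
The proportionality in Eq. (2) can be checked numerically in a simple case. The sketch below is only an illustration under stated assumptions: the distribution is taken to be a single Gaussian with location parameter $\theta$, $G$ is read as the negative of the usual non-negative Kullback-Leibler divergence (my reading of the minus sign in Eq. (2)), and all function names are hypothetical. For a small shift $\Delta y$ one then expects $I \approx -2G/\Delta y^2$.

```python
import math


def gaussian(y, mean, sigma):
    """Normal density with the given mean and standard deviation."""
    z = (y - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def integrate(f, a, b, steps=20000):
    """Composite trapezoidal rule for f on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


def fisher_information(sigma, theta=0.0):
    """Expected FI of the location parameter theta for one Gaussian datum."""
    def integrand(y):
        score = (y - theta) / sigma ** 2  # d ln p / d theta
        return gaussian(y, theta, sigma) * score ** 2
    return integrate(integrand, theta - 12.0 * sigma, theta + 12.0 * sigma)


def kl_divergence(sigma, shift, theta=0.0):
    """Non-negative KL divergence between p(y) and the shifted p(y + shift)."""
    def integrand(y):
        p = gaussian(y, theta, sigma)
        p_shifted = gaussian(y + shift, theta, sigma)
        return p * math.log(p / p_shifted)
    return integrate(integrand, theta - 12.0 * sigma, theta + 12.0 * sigma)


if __name__ == "__main__":
    sigma, shift = 1.5, 1e-2
    fi = fisher_information(sigma)
    g = -kl_divergence(sigma, shift)  # assumed sign convention for G
    print("I           =", fi)
    print("-2 G / dy^2 =", -2.0 * g / shift ** 2)
```

For a Gaussian the two printed numbers should agree, since the divergence between two equal-width Gaussians is exactly quadratic in the shift; for other distributions the agreement is expected only in the limit of small shifts.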

Now, the quantum mechanical (q.m.) analogs of the ML estimators are the operators, i.e. observables,
and the wave function, which is the basic quantity in the Schrödinger quantum (wave) mechanics, is the
carrier of the full information on the eigenvalues of these operators. The parameters of (the distribution
of) the random variable are the analogs of the eigenvalues. Hence the quantum analog of the distribution
of the statistical random variable ought to be the quantum mechanical wave function.
The conclusion follows that quantum mechanics is a statistical method. In statistics the sample is
connected with the random choice of some collection of $N$ states from the whole population of state
collections (i.e. from the sample space). The "most probable" values (estimates) of the population
parameters are those which maximize the joint probability of the data. This procedure of choosing a
particular distribution of states by maximization is known as the maximum likelihood principle (MLp).
Yet the ML method is not completed by the maximization of the likelihood, as usually many distributions
in the distribution space remain. Hence the final procedure is connected with further investigation of the
shape of the distribution [9], which is crucial, because the bigger the Fisher information (about the values
of the parameters) is, the narrower the distribution is. The question arises of what value the Fisher
information should take during the inquiry into the possible shape of the distribution, but to answer it a
new criterion is needed. Proper physical models are to arise as the consequences of a new information
principle (IP). It will be stated later on, but already the feeling should be that the ME principle (MEp)
intervenes somehow. It should be stressed that the IP just mentioned (which should have a physical
background) might choose estimators which are inefficient.^5 The flow diagram below summarizes the
discussed analogy, on inspection of which a question arises: is e.g. the Schrödinger q.m. the consequence
of the statistical IP, or does it make the choice of the distribution from the start by itself?

MLp --> ML estimators --> parameters <-- variable distribution <-- (1 "?") IP <-- MEp
              |               |                    |                       |
              v               v                    v                       |
          Operators --> expected values <-- wave function                  |
              |                                    |                       |
              v                                    v                       v
      Heisenberg's q.m.                   Schrödinger's q.m. <-- <-- <--  (2)

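Furthermore, analyzing the form of the statistical FI of the system, it can be noticed [6] that it may be
written in terms of real amplitudes $q_n$ as follows:

$$I = 4 \sum_{n=1}^{N} \int dx_n \left( \frac{\partial q_n}{\partial x_n} \right)^2 , \quad \text{where} \quad q_n^2(x_n) \equiv p_{x_n}(x_n) . \tag{3}$$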



4 It could be proved that under the regularity conditions the expected FI is the variance of the gradient $\partial \ln p(\theta)/\partial \theta$ [9].
5 In connection with the Cramér-Rao inequality $e^2 I \geq 1$ (where $e^2$ is the mean-square error of the estimate from the true value of
the parameter), the maximal inverse of the possible value of $e^2$ is called the channel capacity, being equal to the Fisher information.
Here $p_{x_n}(x_n)$ is the probability distribution^6 which has the property of shift invariance, i.e.
$p_{x_n}(x_n) = p_{x_n}(x_n|\theta_n) = p_n(y_n|\theta_n)$ with $x_n \equiv y_n - \theta_n$, where $(\theta_n)$ is
the set of quantities (parameters with unknown values) of a physical nature (e.g. positions); $y_n$ are the
$N$ data values and $x_n$ are the added displacements (fluctuations). The transition from Eq. (1) to Eq. (3)
is performed under the assumption^7 that the data are collected independently, which allows the joint
probability $p(y)$ to be expressed via the factorization property as
$p(y) \equiv p(y|\Theta) = \prod_{n=1}^{N} p_n(y_n|\theta_n)$, where $\theta_m$ has no influence on $y_n$ for
$m \neq n$. Now from the amplitudes $q_n$ the wave function $\psi_n$ could be constructed as follows:

$$\psi_n = \frac{1}{\sqrt{N}} \left( q_{2n-1} + i q_{2n} \right) , \quad \text{where} \quad n = 1, 2, \ldots, N/2 , \tag{4}$$



with the number of real degrees of freedom being twice the number of complex ones. After using the total
probability law for all data, the probability distribution for the system (e.g. a particle) could be rewritten
as $p(x) = \sum_{n=1}^{N} p_{x_n}(x_n|\theta_n) P(\theta_n) = \frac{1}{N} \sum_{n=1}^{N} q_n^2$, where we have chosen
$P(\theta_n) = \frac{1}{N}$ according to our lack of knowledge about which one of the $\theta_n$ actually occurs
in the $n$-th experiment.^8 All of this leads to

$$p(x) = \sum_{n=1}^{N/2} \psi_n^* \psi_n , \tag{5}$$

which establishes the right relation between the probability and the wave function. Let us notice that the
shift invariance condition together with the factorization property are very important. Under them the
information $I$ does not depend on the parameter set $(\theta_n)$ [9], and the wave functions (4) do not
depend on these parameters (e.g. positions) either. The distribution (probability law (5)) is then of the
form $p(x) = |\psi(x)|^2$ rather than $|\psi(x|\theta)|^2$. Finally the FI (3) could be explicitly rewritten in
the shape of the kinetic action term

$$I = 4N \sum_{n=1}^{N/2} \int dx \, \frac{\partial \psi_n^*}{\partial x} \frac{\partial \psi_n}{\partial x} , \tag{6}$$

where the index $n$ has been dropped from the integral as the range of all $x_n$ is the same. In this way it
was proven [6] (at least at this stage) that quantum mechanics is indeed a statistical method, yet notice
that the interpretation of the wave function as the classical (i.e. real) probability distribution is
misleading, as in Eq. (4) we have ended up with the complex wave function $\psi_n$.
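
The equivalence of the amplitude form (3) and the wave-function form (6) of the FI, together with the normalization of the probability law (5), can be verified numerically. The following sketch is only an illustration under assumptions of my own: $N = 4$, each $q_n^2$ is a zero-mean Gaussian density with a hypothetical width, all $x_n$ share one range (so the index on $x$ is dropped), and the function names are not taken from the cited works.

```python
import math

WIDTHS = [0.8, 1.0, 1.3, 2.0]  # hypothetical widths, one per real amplitude q_n
N = len(WIDTHS)                # sample size; must be even to pair the amplitudes


def q(n, x):
    """Real amplitude q_n(x), n = 1..N, with q_n**2 a zero-mean Gaussian density."""
    s = WIDTHS[n - 1]
    return (2.0 * math.pi * s * s) ** -0.25 * math.exp(-x * x / (4.0 * s * s))


def psi(n, x):
    """Complex wave function psi_n(x), n = 1..N/2, built as in Eq. (4)."""
    return (q(2 * n - 1, x) + 1j * q(2 * n, x)) / math.sqrt(N)


def derivative(f, x, h=1e-5):
    """Central finite difference; works for real and complex valued f."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def integrate(f, a=-30.0, b=30.0, steps=6000):
    """Composite trapezoidal rule for a real-valued f on [a, b]."""
    step = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * step)
    return total * step


def fisher_from_amplitudes():
    """Eq. (3): I = 4 * sum_n integral (dq_n/dx)^2."""
    return 4.0 * sum(
        integrate(lambda x, n=n: derivative(lambda t: q(n, t), x) ** 2)
        for n in range(1, N + 1)
    )


def fisher_from_wave_functions():
    """Eq. (6): I = 4 N * sum_n integral |dpsi_n/dx|^2."""
    return 4.0 * N * sum(
        integrate(lambda x, n=n: abs(derivative(lambda t: psi(n, t), x)) ** 2)
        for n in range(1, N // 2 + 1)
    )


def probability(x):
    """Eq. (5): p(x) = sum_n |psi_n(x)|^2."""
    return sum(abs(psi(n, x)) ** 2 for n in range(1, N // 2 + 1))


if __name__ == "__main__":
    print("I from Eq. (3):", fisher_from_amplitudes())
    print("I from Eq. (6):", fisher_from_wave_functions())
    print("integral of p :", integrate(probability))
```

With Gaussian amplitudes both expressions should reduce to the sum of the inverse variances, which gives an independent analytic check of the factor $4N$ in Eq. (6).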

Until now we have not established the value of $N$, the sample size. In general, if $N \to \infty$ the ML
estimators have elegant properties: they are unbiased and efficient. In classical mechanics, to specify the
parameter precisely (e.g. the position of the point-like particle) means that an infinite number $N$ of
experiments should be carried out. Hence in classical mechanics $N$ in Eq. (3) goes to infinity. It has also
been proved that the quantum mechanical models are obtained with the notion of the FI for precise, finite
values of $N$ (see [6]). E.g. for the Klein-Gordon^9 and Schrödinger equation (as its limit) $N = 2$, for the
Dirac equation $N = 8$, for the Maxwell equations $N = 4$. Frieden also derived classical mechanics from
the quantum model, but as the limiting case $\hbar \to 0$ only, and it could be noticed that the value of $N$
is irrelevant in his calculations. Now we see that the models belong to different cases of $N$. For the
particular quantum or



6 With the substantial or probabilistic origin. From Section 3 it follows that since the substantial origin is appropriate e.g. for the
Maxwell electromagnetic field, it is then appropriate for the Dirac wave function also, as both of them have the same statistical origin.

7 The chain rule $\partial/\partial\theta_n = \partial/\partial(y_n - \theta_n) \; \partial(y_n - \theta_n)/\partial\theta_n = -\partial/\partial(y_n - \theta_n) = -\partial/\partial x_n$ has also been used.

8 $P(\theta_n) = \frac{1}{N}$ has nothing to do with the Bayesian distribution of one particular parameter $\theta_n$, but with the random
choice of any one of them from the set $(\theta_n)$ during the course of the evolution of the system. In this respect the graphical
interpretation of the Feynman path integrals might be useful.

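9 The formalism of the FI might be easily generalized, leading to relativistically covariant equations. If e.g. $x_n$ is a vector
$(x_n^{\nu})$ then we introduce in Eq. (3) the notation $\left( \frac{\partial q_n}{\partial x_n} \right)^2 = \sum_{\nu} \frac{\partial q_n}{\partial x_{n\nu}} \frac{\partial q_n}{\partial x_n^{\nu}}$; the same notation applies for $\theta_n = (\theta_n^{\nu})$.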
classical field model $N$ is finite (as only the eigenvalue is needed), but for classical mechanics $N$ should
be infinite (to have the information on the position of the classical particle at every moment of time).
Suppose that we have a system which is described by a nonsingular distribution. Then for $N \to \infty$ the
FI (1) diverges to infinity. Yet the same happens for any singular distribution, like the Dirac delta
distribution, also. To see it, let us consider a point-like free particle at rest at the position $\theta$ and
take a $\delta$-Dirac sequence of functions, e.g. the sequence of Gauss functions
$\delta_k(y_n) = \frac{k}{\sqrt{\pi}} \exp(-k^2 (y_n - \theta)^2)$. Then, because for the particular index $k$ the
FI is equal to $I = N/\sigma_k^2$, where $\sigma_k^2 = \frac{1}{2k^2}$ describes the variance of the position of
the particle for the $k$-th element in the sequence, we see that the FI diverges to infinity for
$N \to \infty$ (and even more so for $k \to \infty$). To sum up, for $N \to \infty$ the FI does not exist,
whatever the distribution would be.
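
The growth of the FI along the Gaussian $\delta$-sequence can be illustrated numerically. The sketch below is a minimal illustration under my own assumptions: the sequence is normalized with the factor $k/\sqrt{\pi}$, the $N$ data are independent so that the single-datum FI simply adds up, and the function names are hypothetical. Under these assumptions the result should follow $N/\sigma_k^2 = 2k^2 N$.

```python
import math


def delta_k(y, theta, k):
    """k-th element of the Gaussian Dirac sequence centred at theta."""
    return k / math.sqrt(math.pi) * math.exp(-(k * (y - theta)) ** 2)


def single_datum_fisher(k, theta=0.0, steps=20000):
    """Expected FI about theta carried by one datum drawn from delta_k."""
    half_width = 10.0 / k  # the density is negligible beyond this range
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        y = theta - half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoidal end weights
        score = 2.0 * k * k * (y - theta)  # d ln delta_k / d theta
        total += weight * delta_k(y, theta, k) * score ** 2
    return total * h


def sample_fisher(n_samples, k):
    """FI of n_samples independent data: the single-datum values add up."""
    return n_samples * single_datum_fisher(k)


if __name__ == "__main__":
    for k in (1.0, 10.0, 100.0):
        for n_samples in (2, 200, 20000):
            print(k, n_samples, sample_fisher(n_samples, k))
```

The printed values grow without bound both with the sample size and with the index $k$, which is the behavior the argument above relies on.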

Hence there are two classes of theories with respect to the dimension $N$ of the sample: $N$ for quantum
mechanics (and for classical field theory also) is finite, whereas for classical mechanics it is infinite,
which means that classical mechanics does not have a statistical origin. This completes the proof that
there is an inherent difference between quantum and classical mechanics. To the best of my knowledge it
has not yet been given in this simple statistical form. The proof does not cover the impossibility of
deriving quantum mechanics (or another quantum theory) from a classical field theory (or a self-consistent
field theory).
Let us notice that the FI looks like the action for the kinetic energy term (Eq. (6)). Using this quantity
(and new postulates on the physical information during the process of measurement, Section 3), Frieden [6]
derived some of the quantum mechanical models via the way (2) from the above flow diagram.



3 Fisher information and the self-field theory The result which follows from the previous Section is that
all physical models fall into two categories: they are either of classical mechanics origin or of statistical
origin. So the division does not lie between what is micro and what is macro, but between what is of
statistical and what is of classical mechanics origin or, better still, between what is of field (wave) theory
origin and what is of strictly point-like origin. The consequences are as follows. Mixing classical mechanics
with field theory models leads to inconsistencies, such as the one in the Lorentz-Abraham-Dirac equation,
which leads to the self-acceleration of a point-like charged particle interacting with its own
electromagnetic field [10]. On the other hand, combining quantum (wave) mechanics with classical
electrodynamics is more promising.

The question arises: are both wave (quantum) mechanics and classical field theories of statistical origin?
According to Section 2 there is a reason to acknowledge any quantum model whose kinetic action resembles
the FI as a statistical one. It has also been shown [6] that the main classical model, that is Maxwell
electrodynamics, has the same statistical structure. At this point we need to construct the statistical
predecessors of both the kinetic action and the structural one. From the statistical perspective the first
one is the carrier of information about the system in the measurement, whereas the second one is the
carrier of information about the structure of the system, which reveals itself somehow in the measurement
scenario, taking into account additional constraints. We need to bind both types of information by a new
principle. As has been said in Section 2, the predecessor of the kinetic part is the Fisher information $I$.
The construction of the structural statistical term, called $Q$, follows the particular characteristics of
the theory, which take into account the physical parameters of the model. According to Eq. (3), $I$ is a
function of the amplitudes $q(x)$, so $Q$ has to be one also. Yet $Q$ has to depend on the physical constants
of the particular scenario as well, e.g. on $\hbar$ or $c$. Now, because there exists an entropy for the
kinetic term, namely the Kullback-Leibler relative entropy $G$ with the implication $G \to I$ (Eq. (2)),
there should exist an entropy term for $Q$ also; let us call it $S_Q$.

A system without a structure dissolves; hence its equation of motion requires a structural term, and
while "putting this structure upon" the system its entropy has to be minimized and its information
maximized. Yet once the constraints have been established, then from all distributions the one which
maximizes the entropy and minimizes the information should be chosen via a new variational principle
$G + S_Q \to \max$ or $I + Q \to \min$, which we call the scalar principle. It might be written in the
following form

$$\delta(I + Q) = 0 , \quad \text{principle I (scalar)} \tag{7}$$

and interpreted as a conservation law of the physical information $K = I + Q$ of the system. Both $I$ and
$Q$ are (final) information which exist in the system, but only $I$ reveals itself to the observer in the
process of measurement. Although $Q$ influences $I$, it is lost to the observer (in the measurement) and is
carried inside the system only. The first principle (7) does not exhaust all possibilities. The intriguing
thing is that many calculations might be done for the most pessimistic scenario, under which the total
entropy partitions itself equally (or with a factor 1/2) into the Fisher and structural parts, having in
total the value zero. Hence in practice it turned out [6] that the law (7) has to be supplemented by the
following one:

$$I + kQ = 0 , \quad \text{where } k = 1 \text{ or } 1/2 , \quad \text{principle II (internal)} , \tag{8}$$

which we call the internal principle. The minus sign in $Q \sim -I$ is not as strange as it seems. For
example, for a pure classical state the Shannon entropy goes to minus infinity, and this means that an
infinite amount of information should be taken to specify such a state exactly.

In [6] another approach to the structure information was presented. Frieden introduced the so-called
bound information $J$, which is interpreted as being confined in the system before the measurement.
Although Frieden's axioms are operationally similar to Eqs. (7) and (8) provided that $J = -Q$, the
difference in the interpretations is obvious. Whereas the system in Frieden's interpretation exhibits
during the measurement the transfer of information $I \to J$, having at any moment of time only one of these
two types of information, in our scenario the system is characterized by $I$ and $Q$ simultaneously at any
moment of time.
At first look (which has lasted over 70 years) it seems that the Klein-Gordon and Dirac equations are more
similar to each other than the Dirac and Maxwell ones. But using Eq. (7) and Eq. (8) the Klein-Gordon
equation is obtained, whereas Eq. (8) alone gives the Dirac equation or the Maxwell equations for $k = 1$ or
$k = 1/2$, respectively [6]. Hence the Dirac and Maxwell cases are more similar in their axiomatic origin.
Yet the source of their difference in the value of $k$ is also important. The comparison of the Dirac and
Maxwell equations suggests that the ratio of $Q$ to $I$ is twice as big in the Maxwell case as in the Dirac
case.
In this context one more puzzle is solved. In 1990 Sallhofer [11] completed a model of the (hydrogen)
atom based on the isomorphism between the Maxwell and Dirac formalisms. Working in Minkowski space, he
established the strong formal mathematical similarity (I do not call it identity) of electrodynamics and
wave mechanics, by means of which he proved that the hydrogen atom might be seen as a pair of mutually
refracting electromagnetic waves. Previously this similarity was pointed out by Sakurai [11]. Starting from
the Maxwell equations, Sallhofer obtained what looks like the Dirac equation for the hydrogen atom, but
with twice as many components for the "electronic" field as there are in the original Dirac equation. The
physical structural identification of some of these components gives four degrees of freedom, as for the
Dirac field [11], which means that the Maxwell equations are of a more fundamental nature than the Dirac
one.

After the choice of axiom I or II (which one to choose should be verified by experiment), the
calculations of $Q$ which follow are sometimes tedious. The simplest case is that of the scalar particle
with $N = 2$ (see [6]). Then we have a single complex wave function $\psi(x)$ in the position $x$ space
and its Fourier transform $\phi(\mu)$ in the momentum $\mu$ space. After choosing the internal principle (8)
with $k = 1$, $I[\psi_0(x)] + Q[\phi_0(\mu)] = 0$, which means that information is equally distributed among
the kinetic and structural parts, the Fisher information $I$ and the structural information $Q$ are equal to
$I[\psi] = 8 \int dx \, \frac{\partial \psi^*}{\partial x} \frac{\partial \psi}{\partial x}$ and
$Q[\phi] = -\frac{8}{\hbar^2} \int d\mu \, \mu^2 \phi^*(\mu) \phi(\mu)$, respectively. The wave functions
$\psi$ and $\phi$ satisfying this internal principle are $\psi_0$ and $\phi_0$, respectively. Yet to obtain the
Klein-Gordon equation the scalar principle (7) should be used also. But whether for scalar, spinor or
vector fields,^10 the proposed procedure leads to the proper information $I$ and $Q$, giving as a result the
kinetic and structural actions and, consequently, the equations of motion. This fact means that there is a
statistical quantity (namely information) which precedes action, and that there are information principles
(I or II) which stand before the variational principle of the total action.
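
The balance $I[\psi] + Q[\phi] = 0$ for the $N = 2$ expressions quoted above can be illustrated with a one-dimensional toy example. The sketch below rests on assumptions of my own: a Gaussian packet $\psi_0(x)$, its Fourier transform $\phi_0(\mu)$ written in closed form for the unitary convention, an arbitrary numerical value for $\hbar$, and hypothetical function names. It is not the relativistic calculation of [6], only a numerical check of how the two integrals compensate each other.

```python
import math

HBAR = 0.7  # arbitrary units; the balance below should hold for any positive value


def psi0(x, s):
    """Gaussian wave packet in position space with |psi0|^2 of variance s^2."""
    return (2.0 * math.pi * s * s) ** -0.25 * math.exp(-x * x / (4.0 * s * s))


def dpsi0_dx(x, s):
    """Analytic derivative of psi0 with respect to x."""
    return -x / (2.0 * s * s) * psi0(x, s)


def phi0(mu, s):
    """Fourier transform of psi0 in momentum space (unitary convention)."""
    norm = (2.0 * s * s / (math.pi * HBAR * HBAR)) ** 0.25
    return norm * math.exp(-(s * mu / HBAR) ** 2)


def integrate(f, a, b, steps=8000):
    """Composite trapezoidal rule for f on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


def kinetic_information(s):
    """I[psi] = 8 * integral dx |dpsi/dx|^2 (the N = 2 case)."""
    return 8.0 * integrate(lambda x: dpsi0_dx(x, s) ** 2, -40.0 * s, 40.0 * s)


def structural_information(s):
    """Q[phi] = -(8 / hbar^2) * integral dmu mu^2 |phi|^2."""
    cutoff = 20.0 * HBAR / s
    integral = integrate(lambda mu: mu * mu * phi0(mu, s) ** 2, -cutoff, cutoff)
    return -8.0 / HBAR ** 2 * integral


if __name__ == "__main__":
    s = 1.3
    i_val = kinetic_information(s)
    q_val = structural_information(s)
    print("I     =", i_val)
    print("Q     =", q_val)
    print("I + Q =", i_val + q_val)
```

For this packet both integrals can also be done by hand, giving $I = 2/s^2$ and $Q = -2/s^2$ independently of $\hbar$, so the printed sum should vanish up to numerical error.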

10 To obtain the Dirac or Maxwell equations the internal principle II is enough.



© 2003 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim






J. Syska: The Fisher's information and quantum models - classical field theory, classical statistics similarity 



4 Conclusions In the paper the classical-statistics proof of the impossibility of deriving quantum
mechanics from classical mechanics has been presented. This statement follows from the fact that the
Fisher information for different values of the dimension $N$ of the finite sample gives different field
theories; hence none of them is equivalent to classical mechanics, for which $N$ is infinite. Physically it
might be understood as a result of the fact that for a particular quantum model with $N$ established, only
the eigenvalue is needed (in order to describe the state), whereas to have information on the position of
the classical mechanics particle at every moment of time, infinite $N$ is needed. To obtain any field theory
two new principles were proposed: the scalar one, connected with the minimization of the total information
in the system, and the internal one, on zeroing this total information. It was pointed out that the notion
of information stands before the usual physical action. Much of the work has been done previously by
Frieden and Soffer [6] and it might rightly be called the Frieden approach to equations of motion, yet the
method should be reinterpreted, particularly in understanding the structural information, and developed in
finding the information predecessors for sources and the physics beyond the value of $N$.

Finally, because the same formalism of the statistical Fisher information has been used for the
construction of different field theory models (classical and quantum), it is also a tool for the
construction of a self-consistent field theory [8], one which joins quantum theory and classical field
theory in one logically consistent mathematical apparatus.